Citation: Qin, J., Kim, D., & Opfer, J. E. (2017). Varieties of numerical estimation: A unified framework. In G. Gunzelmann, A. Howes, T. 
Tenbrink, & E. J. Davelaar (Eds.), Proceedings of the 39th annual meeting of the cognitive science society (pp. 2943-2948). Austin, TX: Cognitive 
Science Society. https://cogsci.mindmodeling.org/2017/papers/0556/ 


Varieties of Numerical Estimation: A Unified Framework 


Jike Qin (qin.284@osu.edu) 
Department of Psychology, 1835 Neil Avenue 
Columbus, OH 43210 USA 


Dan Kim (kim.3839@osu.edu) 
Department of Psychology, 1835 Neil Avenue 
Columbus, OH 43210 USA 


John Opfer (opfer.7@osu.edu) 
Department of Psychology, 1835 Neil Avenue 
Columbus, OH 43210 USA 


Abstract 


There is an ongoing debate over the psychophysical functions 
that best fit human data from numerical estimation tasks. To 
test whether one psychophysical function could account for 
data across diverse tasks, we examined 40 kindergartners, 38 
first graders, 40 second graders and 40 adults’ estimates using 
two fully crossed 2 x 2 designs, crossing symbol (symbolic, 
non-symbolic) and boundedness (bounded, unbounded) on 
free number-line tasks (Experiment 1) and crossing the same 
factors on anchored number-line tasks (Experiment 2). This 
strategy yielded 4 novel tasks to assess the generalizability of 
the models. Across all 8 tasks, 90% of participants provided 
estimates better fit by a mixed log-linear model than other 
cognitive models, and the weight of the logarithmic 
component (λ) decreased with age. After controlling for age, 
the weight of the logarithmic component (λ) significantly 
predicted arithmetic skills, whereas parameters of other 
models failed to do so. Results suggest that the logarithmic- 
to-linear shift theory provides a unified account of numerical 
magnitude estimation and provides uniquely accurate 
predictions for mathematical proficiency. 


Introduction 


In this paper, we aimed to address an ongoing debate on the 
psychophysical functions that link numbers to their 
magnitude estimates and to provide a unified framework for 
understanding seemingly-conflicting data from a variety of 
studies (Barth & Paladino, 2011; Cohen & Sarnecka, 2014; 
Opfer, Thompson, & Kim, 2016; Siegler & Opfer, 2003; 
Slusser, Santiago, & Barth, 2013). Specifically, we sought 
to test whether models that fit data from old research 
methods could accurately predict data from new methods 
that differed in small increments that were thought to be 
psychologically meaningful. Finally, we aimed to test 
whether models that best accounted for numerical 
magnitude estimates also provided the best predictors of 
educational outcomes. 

The classic theory about developmental change in 
numerical estimation is that the representation of numerical 
magnitudes follows a logarithmic-to-linear shift (Siegler & 
Opfer, 2003; Siegler, Thompson, & Opfer, 2009). This 
account rested mostly on a single version of the 
number-line task: the symbolic bounded free 
branch of the taxonomy in Figure 1. 


Figure 1: Taxonomy of number-line tasks. Branches 
connected by solid lines were examined in previous studies. 
Ones connected with dashed lines are new. 


Two alternative accounts were recently proposed. One is 
the proportional-judgment account, claiming participants 
adopt proportion judgment strategies when estimating 
numerical magnitudes (Barth & Paladino, 2011; Slusser et 
al., 2013). The other is the measurement-skills account, 
claiming that data from number-line tasks arise from task- 
specific measurement skills (Cohen & Blanc-Goldhammer, 
2011; Cohen & Sarnecka, 2014). Like the classic theory, 
each of these accounts rested on a specific 
number-line task: symbolic bounded anchored for the 
proportional-judgment account and symbolic unbounded free 
for the measurement-skills account (see Figure 1). 


1. Symbolic vs. Non-symbolic 


One potentially important variable is whether numerical 
magnitudes are presented symbolically or non-symbolically. 
Most studies have focused only on the symbolic magnitude 
estimates, though with different psychophysical functions 
being proposed (Barth & Paladino, 2011; Cohen & 
Sarnecka, 2014; Opfer et al., 2016; Siegler & Opfer, 2003; 
Slusser et al., 2013). 


In contrast, when Dehaene et al. (2008) and Anobile et al. 
(2012) presented subjects with non-symbolic numeric 
magnitudes, they found that a mixed log-linear model 
(MLLM) provided a better fit to data than alternatives. 
Among these alternatives, however, the power models 
proposed by Slusser et al. (2013) and Cohen & Blanc- 
Goldhammer (2011) were not included. 

The MLLM consists of a logarithmic and a linear 
component, and is defined by the following equation: 

$$y = a\left((1-\lambda)\,x + \lambda\,\frac{U}{\ln U}\,\ln x\right)$$

where y represents an estimate of number x on a 0-U 
number line, a is a scaling factor, and λ is a logarithmicity index 
that denotes the degree of logarithmic compression in 
estimates. The index λ takes a value 
between 0 and 1. If estimation is perfectly linear, λ 
converges to 0, whereas λ approaches 1 
as estimation shows more logarithmic 
compression. 
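
As a rough illustration of how the two components trade off, the equation can be written as a small function. This is only a sketch: the function name `mllm` and the default arguments are our own choices, and it assumes x > 0 and U > 1 so that both logarithms are defined.

```python
import math

def mllm(x, U, a=1.0, lam=0.0):
    """Mixed log-linear estimate of x on a 0-U number line (assumes x > 0, U > 1)."""
    linear_part = x
    # The log term is rescaled by U / ln(U) so that it also reaches U at x = U.
    log_part = (U / math.log(U)) * math.log(x)
    return a * ((1.0 - lam) * linear_part + lam * log_part)

U = 30
for lam in (0.0, 0.5, 1.0):
    row = [round(mllm(x, U, lam=lam), 2) for x in (1, 5, 15, 30)]
    print(f"lambda={lam}: {row}")
```

With lam = 0 the predictions fall on the identity line; as lam grows, small numbers are pushed toward larger estimates while the endpoint U stays fixed (for a = 1).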


2. Bounded vs. Unbounded 


Another potentially important feature of number-line 
estimation is whether an upper endpoint is provided 
(bounded) or not (unbounded). Cohen and his colleagues 
have suggested that extensions of cyclic power models (CPMs) 
provide the best fit to estimates in the bounded 
condition and that scallop power models (SPMs) provide 
the best fit to estimates in the unbounded condition (Cohen 
& Blanc-Goldhammer, 2011; Cohen & Sarnecka, 2014). 
Those studies did not include the mixed log-linear model of 
Dehaene et al. (2008) among the alternatives tested. Kim 
and Opfer (in press) found the MLLM was a better predictor 
of estimates than CPMs and SPMs for symbolic bounded 
free and symbolic unbounded free number-line tasks. 

For the bounded task, Kim and Opfer (in press) built a 
mixed cyclic power model (MCPM2) from Cohen and 
Sarnecka's (2014) extensions of cyclic models (CPMs). It is 
given by the following equation: 

$$y = w_1 \times \mathrm{SBCM} + w_2 \times \mathrm{1CPM} + w_3 \times \mathrm{2CPM},$$

where SBCM is a subtraction bias cyclic model, and 1CPM and 
2CPM are cyclic power models with two and three reference 
points, respectively. The weights w1, w2, and w3 give the 
contribution of each variant of the cyclic model. 

For unbounded tasks, they likewise built a mixed scallop 
model (MSPM) from Cohen and Blanc-Goldhammer (2011), 
defined as: 

$$y = w_1 \times \mathrm{1SPM} + w_2 \times \mathrm{2SPM} + w_3 \times \mathrm{MSPM},$$

where 1SPM denotes the single scallop model, 2SPM the 
dual scallop model, and MSPM the multiple scallop model. 


3. Free vs. Anchored 


A third potentially important variable is whether subjects 
are given the numeric magnitude of the half-way point on 
the number-line (anchored) or not (free). Slusser et al. 
(2013) showed 5- to 10-year-old children symbolic bounded 
anchored number lines and found children’s estimates were 
better fit by one of three adapted cyclical power models 
(CPMs) (Hollands & Dyre, 2000) than a simple linear 
model. Subsequent studies, however, found the MLLM 
provided a better fit to both symbolic bounded free and 
symbolic bounded anchored number-line estimates than 
a mixture of the CPMs (Opfer et al., 2016), which Kim and 
Opfer (in press) called MCPM1. The equation 
is as follows: 

$$y = w_1 \times \mathrm{0CPM} + w_2 \times \mathrm{1CPM} + w_3 \times \mathrm{2CPM},$$

where 0CPM, 1CPM, and 2CPM indicate different variants of 
the cyclic power model, and w1, w2, and w3 their weights. 
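
To make the mixture idea concrete, the sketch below combines component predictions by a weighted sum. The component formulas are not given in this paper; the one- and two-cycle functions here follow the proportional power form commonly attributed to Hollands and Dyre (2000) and should be read as an assumption, as should all function names. The subtraction-bias and scallop components are omitted.

```python
def one_cycle(x, U, beta):
    """One-cycle proportional power model (reference points 0 and U). Assumed form."""
    num = x ** beta
    return U * num / (num + (U - x) ** beta)

def two_cycle(x, U, beta):
    """Two-cycle version: adds an implicit reference point at U/2. Assumed form."""
    half = U / 2.0
    if x < half:
        return one_cycle(x, half, beta)
    return half + one_cycle(x - half, half, beta)

def mixture(x, U, weights, components):
    """Weighted sum of component predictions: y = sum_i w_i * f_i(x)."""
    return sum(w * f(x, U) for w, f in zip(weights, components))

# Hypothetical weights and bias; real values would come from model fitting.
components = [lambda x, U: one_cycle(x, U, 0.6),
              lambda x, U: two_cycle(x, U, 0.6)]
print(round(mixture(5, 30, [0.5, 0.5], components), 2))
```

When beta equals 1, both components reduce to y = x, so any set of weights summing to 1 reproduces unbiased estimates; whether the weights were constrained to sum to 1 in the original analyses is not stated here.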


The Current Study 


In this study, we manipulated all three variables 
orthogonally to systematically test the mixed log-linear 
model against its competitors on 4 previously examined 
tasks and 4 novel tasks. Thus, we tested all the branches 
shown in Figure 1, with symbolic bounded free (SBF), 
symbolic unbounded free (SUF), non-symbolic bounded 
free (NBF), non-symbolic unbounded free (NUF) tasks in 
Experiment 1 and symbolic bounded anchored (SBA), 
symbolic unbounded anchored (SUA), non-symbolic 
bounded anchored (NBA), non-symbolic unbounded 
anchored (NUA) tasks in Experiment 2. At the end of 
Experiment 2, we also administered a battery of math tests, 
including addition and subtraction, to each subject to 
determine which model parameters best predicted addition 
and subtraction proficiency. This issue has educational 
significance, but it also tests the key cognitive process claim 
of the measurement-skills account, viz. that unbounded 
number-line estimates are easier than the bounded ones 
because they require addition skills rather than subtraction 
skills. 


Experiment 1: Free Numerical Estimation 


Methods 


Participants Participants were 40 kindergartners (M=5.98 
years; 47.5% female), 38 first-graders (M=7.13 years; 50% 
female), 40 second-graders (M=8.09 years; 57.5% female) 
and 40 adults (M=20.1 years; 50% female). 
Materials and procedure Participants were administered 
four different number-line tasks using a 2 (symbolic/non- 
symbolic) by 2 (bounded/unbounded) fully-crossed design. 
Order of tasks was determined by a balanced Latin square. 
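
For readers unfamiliar with the design, one standard way to build a balanced Latin square for an even number of conditions is the Williams construction sketched below. The paper does not say which construction was used, so this is an illustration only, and the function name is ours.

```python
def balanced_latin_square(conditions):
    """Return len(conditions) task orders forming a Williams design (even n only)."""
    n = len(conditions)
    if n % 2:
        raise ValueError("this construction needs an even number of conditions")
    # First row follows the index pattern 0, 1, n-1, 2, n-2, ...
    first = [0]
    lo, hi = 1, n - 1
    while len(first) < n:
        first.append(lo)
        lo += 1
        if len(first) < n:
            first.append(hi)
            hi -= 1
    # Each later row shifts every index by one (mod n).
    return [[conditions[(i + shift) % n] for i in first] for shift in range(n)]

for order in balanced_latin_square(["SBF", "SUF", "NBF", "NUF"]):
    print(order)
```

Each task appears once in every serial position, and each task immediately follows every other task exactly once across the four orders.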

In symbolic conditions, participants were presented with 
20 number-lines, with a number on each endpoint of the 
line. The to-be-estimated numerals were evenly sampled 
from 0 to 30. On each trial, numbers were shown for 2 s, 
followed by a random-noise mask. In non-symbolic 
conditions, the procedure was similar, except that the endpoints of 
lines and the to-be-estimated numbers were dot arrays. Dot size 
was controlled on 50% of trials, and the area covered 
by dots was controlled on the other 50%. 

In bounded conditions, endpoints of the line were 0 and 
30 (symbolic condition) or 0 and 30 dots (non-symbolic 
condition). In unbounded conditions, endpoints were 0 
and 1 (symbolic condition) or 0 and 1 dot (non-symbolic 
condition). The instructions for the unbounded condition 
were taken from Cohen and Sarnecka (2014). 


Results 


1. Logarithmic-to-linear-shift theory accurately 
predicted median estimates and individual differences. 
We first fit median estimates for all four number-line tasks 
and age groups using MLLM. Across all tasks and age 
groups (Figure 2), the fit of MLLM was very high (R² = .93 to 1). 


Analyses of the weight of the logarithmic component (λ) 
revealed that with age, estimates changed from logarithmic 
patterns to linear ones, with λ decreasing from 
kindergartners to adults across all tasks (Figure 2). As 
expected, λ in non-symbolic conditions was higher than in 
symbolic ones. Also, λ in unbounded conditions was higher 
than in bounded ones regardless of symbolic format, which 
argues against the view that “the unbounded task requires 
less mathematical sophistication than the bounded task 
does” (Cohen & Sarnecka, 2014). To test whether 
individual performance revealed the same pattern, we 
computed λ for individual participants’ data and conducted 
a mixed ANOVA, with symbolic format and boundedness 
as within-participant factors and age group as a between- 
participant factor. Results showed main effects of symbolic 
format, F(1,154)=74.19, p<.001, boundedness, 
F(1,154)=86.32, p<.001, and age group, F(3,154)=39.08, 
p<.001. An interaction between symbolic format and 
boundedness, F(1,154)=4.17, p<.05, indicated that the effect 
of symbols was greater for the bounded tasks. 
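
A participant-level logarithmicity weight of the kind entered into this ANOVA can be obtained by least squares. The sketch below is one simple way to do it, not necessarily the procedure used here: for each candidate weight on a grid, the scaling factor a has a closed-form solution because the model is linear in a. The function names, the grid resolution, and the synthetic data are all our own assumptions, and targets are assumed to be greater than 0.

```python
import math

def mllm_basis(x, U, lam):
    """MLLM prediction with a = 1 (assumes x > 0, U > 1)."""
    return (1.0 - lam) * x + lam * (U / math.log(U)) * math.log(x)

def fit_mllm(xs, ys, U, steps=1000):
    """Least-squares fit of (a, lam) by grid search over lam in [0, 1]."""
    best = None
    for i in range(steps + 1):
        lam = i / steps
        g = [mllm_basis(x, U, lam) for x in xs]
        # For fixed lam the model is y = a * g, so a has a closed form.
        a = sum(y * gi for y, gi in zip(ys, g)) / sum(gi * gi for gi in g)
        rss = sum((y - a * gi) ** 2 for y, gi in zip(ys, g))
        if best is None or rss < best[2]:
            best = (a, lam, rss)
    return best  # (a, lam, residual sum of squares)

# Synthetic, noise-free estimates generated with lam = 0.4 on a 0-30 line.
targets = list(range(2, 31, 2))
estimates = [mllm_basis(x, 30, 0.4) for x in targets]
a_hat, lam_hat, rss = fit_mllm(targets, estimates, 30)
print(a_hat, lam_hat, rss)
```

The returned residual sum of squares can be reused for R² or for an information criterion when comparing models.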

To test whether logarithmicity of estimates represented a 
stable pattern of individual differences, we correlated 
individual participants’ λ values across tasks. Results showed 
that λ values were positively correlated across all four 
number-line tasks: r = .70 between SBF and SUF; .45 
between SBF and NBF; .35 between SBF and NUF; .49 
between SUF and NBF; .39 between SUF and NUF; and .54 
between NBF and NUF (all p<.001). 
2. Model comparison. We next compared the fit of MLLM 
to that of its competitors: MLLM vs. MCPM1 and MCPM2 
for the bounded conditions and MLLM vs. MSPM for the 
unbounded ones. For each task, we calculated the proportion 
of individual participants whose estimates were best fit by 
the mixed log-linear model (MLLM) according to AICc. 
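
The comparison can be sketched as follows, assuming the usual least-squares form of the corrected Akaike criterion, AICc = n ln(RSS/n) + 2k + 2k(k+1)/(n-k-1), with n data points and k free parameters. The residual sums of squares and the parameter count given for MCPM1 below are invented for illustration; conventions for counting k (for example, whether the error variance is included) vary and are not specified in this paper.

```python
import math

def aicc(rss, n, k):
    """Corrected AIC for a least-squares fit with n data points and k parameters."""
    if n - k - 1 <= 0:
        raise ValueError("AICc needs n > k + 1")
    aic = n * math.log(rss / n) + 2 * k
    return aic + (2 * k * (k + 1)) / (n - k - 1)

def best_model(fits, n):
    """fits maps model name -> (rss, k); returns the name with the lowest AICc."""
    return min(fits, key=lambda name: aicc(fits[name][0], n, fits[name][1]))

# Hypothetical fits for one participant with 20 trials.
fits = {"MLLM": (40.0, 2), "MCPM1": (36.0, 6)}
print(best_model(fits, n=20))
```

With only 20 trials per task, the small-sample correction term grows quickly with k, so a mixture with many weights and bias parameters must reduce the residual error substantially to be preferred.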

As illustrated in Table 1, estimates of 68% to 100% of 
participants were best fit by the mixed log-linear model 
(MLLM) across all four free number-line tasks. In the 
bounded condition, neither MCPM was the best-fitting 
model for the majority of participants in any age-by-task 
combination. In the unbounded condition, we replicated the 
findings of Kim and Opfer (in press), with MLLM providing 
a better fit than MSPM for 100% of participants’ estimates. 


[Figure 2 panels, left to right: K, 1st grader, 2nd grader, adult. x-axis: ACTUAL (0-30); y-axis: ESTIMATE (0-30). Values legible for one row of panels: λ = 0.89, 0.42, 0.37, 0.17; R² = 0.93, 0.95, 0.96, 0.99.] 


Figure 2: Median estimates on 0-30 free number lines for 
different age groups. 


Table 1: Percent of participants best fit by MLLM for free 
number line tasks. K, kindergartners; 1, first graders; 2, 
second graders; A, adults. 


        MLLM 
Task    K     1     2     A     All 
SBF     95    89    83    68    84 
NBF     95    95    93    90    93 
SUF    100   100   100   100   100 
NUF    100   100   100   100   100 


Table 2: Partial correlation between λ in MLLM and math 
score after controlling for age across the free number line 
tasks. 


                   Partial correlation 
Task   Parameter   Addition    Subtraction 
SBF    λ           -.40 ***    [illegible] 
NBF    λ           -.27 ***    -.19 * 
SUF    λ           -.36 ***    -.24 ** 
NUF    λ           -.17 *      -.17 * 


Note. * p<.05, ** p<.01, *** p<.001 


3. Predicting mathematical performance. We next 
conducted partial correlation analyses between individual 
participants’ addition and subtraction performance and the 
best-fitting parameter values from the models, 
controlling for age. The addition score was the sum score of 
simple and complex addition problems, and the subtraction 
score was the sum score of simple and complex subtraction 
problems. 
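
For a single control variable, a partial correlation can be computed from the three pairwise Pearson correlations: r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz²)(1 - r_yz²)). The sketch below implements that textbook formula; the variable names and toy data are ours, and we do not know whether the original analysis used this formula or an equivalent regression-residual approach.

```python
import math

def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

def partial_corr(x, y, z):
    """Correlation of x and y with the linear effect of z removed from both."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Toy data: z plays the role of age; x and y both rise with z, but their
# z-independent parts move in opposite directions.
z = [-1, -1, 1, 1]
x = [0, -2, 2, 0]
y = [-3, -1, 1, 3]
print(round(pearson(x, y), 3), round(partial_corr(x, y, z), 3))
```

In the toy data the raw correlation is positive while the partial correlation is negative, which shows why controlling for age matters when both the model parameter and arithmetic scores change with age.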

As shown in Table 2, the logarithmicity parameter λ of 
the MLLM predicted both addition and subtraction 
performance across all tasks after controlling for age. In 
contrast, the correlations between the parameters of the 
MLLM's competitors and math performance were very small, 
inconsistent, and not expected by the theories that generated 
the models. Specifically, for bounded conditions, a negative 
correlation between the absolute value of β-1 in the 
MCPMs and math performance was found in only a few 
number-line tasks: the absolute value of β_0CPM-1 in 
MCPM1 correlated negatively with addition for the 
symbolic bounded free task (r=-.18, p<.05), and the 
absolute value of β_SBCM-1 in MCPM2 correlated negatively 
with addition and subtraction for the non-symbolic bounded 
free task (r=-.17, p<.05 for addition; r=-.17, p<.05 for 
subtraction). In the symbolic bounded free task, only one 
β-1 term in MCPM2 correlated negatively with 
subtraction (r=-.17, p<.05). For the 
unbounded condition, a negative correlation between the 
absolute value of a β-1 term in MSPM and addition was 
found only in the symbolic unbounded free task (r=-.18, p<.05). 
These findings suggest that the MLLM uniquely predicts math 
performance, regardless of task or age group. 


Experiment 2: Anchored Numerical 
Estimation 


Methods 


Participants Participants in Experiment 2 were the same as 
in Experiment 1. 

Materials and procedure Participants received the same 2 
(symbolic/non-symbolic) by 2 (bounded/unbounded) 
number line tasks as in Experiment 1, except that 
information was given about the location of 15 (or 15 dots) 
in each of the four tasks. Order of tasks followed a Latin 
square. After that, 200 arithmetic problems were presented 
for participants to solve as quickly as possible: simple 
addition, simple subtraction, complex addition and complex 
subtraction. For simple addition problems, each of the 
addends was a one-digit number and the sum was no more 
than 10 (e.g., 5+3, 2+1). For simple subtraction problems, 
the difference was less than 10 and both minuend and 
subtrahend were one-digit numbers (e.g., 9-3, 8-2). For 
complex addition problems, sums were bigger than 10 but 
less than 30, and addends were one- or two-digit numbers 
(e.g., 4+16, 14+15). For complex subtraction problems, 
differences were bigger than 10 but less than 30, with the 
minuend a two-digit number and the subtrahend one- or 
two-digit numbers (e.g., 16-5, 25-11). 
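
The four problem categories can be restated as a small classifier, which also makes the boundary cases explicit. This is our own encoding of the definitions above; treating 0 as a one-digit number and requiring non-negative differences are assumptions the text does not settle.

```python
def is_one_digit(n):
    return 0 <= n <= 9       # assumption: 0 counts as a one-digit number

def is_two_digit(n):
    return 10 <= n <= 99

def classify(a, op, b):
    """Label the problem 'a op b' with one of the four categories, or None."""
    if op == "+":
        total = a + b
        if is_one_digit(a) and is_one_digit(b) and total <= 10:
            return "simple addition"
        operands_ok = all(is_one_digit(n) or is_two_digit(n) for n in (a, b))
        if operands_ok and 10 < total < 30:
            return "complex addition"
    elif op == "-":
        diff = a - b
        if is_one_digit(a) and is_one_digit(b) and 0 <= diff < 10:
            return "simple subtraction"
        if is_two_digit(a) and (is_one_digit(b) or is_two_digit(b)) and 10 < diff < 30:
            return "complex subtraction"
    return None

for problem in [(5, "+", 3), (14, "+", 15), (9, "-", 3), (25, "-", 11)]:
    print(problem, classify(*problem))
```

Problems that satisfy none of the four definitions, such as a subtraction with a difference of exactly 10, return None.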


Results 


1. Logarithmic-to-linear-shift theory accurately 
predicted median estimates and individual differences. 
We first fit the median estimates for all four number line 
tasks and age groups using MLLM. As shown in Figure 3, 
across all tasks and age groups, the fit of MLLM was 
uniformly high (R² = .93 to 1). Analyses of λ revealed that 
with age, estimates changed from logarithmic patterns to 
linear ones, with λ decreasing from kindergartners to adults 
(Figure 3). As in Experiment 1, λ in non-symbolic 
conditions was higher than in symbolic ones, and λ in 
unbounded conditions was higher than in bounded 
conditions regardless of symbolic format. We also computed λ for 
individual participants’ data. The mixed ANOVA 
again showed main effects of symbolic format, F(1,154)=
83.17, p<.001, boundedness, F(1,154)=21.20, p<.001, and 
age group, F(3,154)=19.63, p<.001. 

To test whether the logarithmic-to-linear-shift theory 
could also capture individual differences, we correlated 
individual participants’ λ values across tasks. The results showed 
that λ values were positively correlated across all four 
number-line tasks: r = .81 between SBA and SUA; .61 
between SBA and NBA; .48 between SBA and NUA; .54 
between SUA and NBA; .43 between SUA and NUA; and .61 
between NBA and NUA (all p<.001). 
2. Model comparison. We next examined whether MLLM 
fit better than its competitors. Following 
previous studies (Cohen & Sarnecka, 2014; Opfer et 
al., 2016; Slusser et al., 2013), we compared the fit of 
MLLM, MCPM1, and MCPM2 on individual data for the 
bounded condition (which included SBA and NBA tasks). 
Since the unbounded anchored number-line tasks were new 
in this study, we compared the fit of all four models for 
the unbounded condition (which included SUA and NUA 
tasks). We calculated the proportion of individual participants 
whose estimates were best fit by the mixed log-linear model 
(MLLM) according to AICc. 

As illustrated in Table 3, the estimates of 63% to 100% of 
participants were best fit by the mixed log-linear model 
(MLLM) across all four anchored tasks and all age 
groups. Specifically, in the bounded condition, regardless of 
symbolic format and contrary to the proportion-judgment 
account and the subtraction- or division-skill account, neither 
MCPM was the best-fitting model for the majority. In 
the unbounded condition, estimates 
of 78% to 98% of participants were best fit by MLLM 
when the fits of all four models were compared. All these 
results suggest that the logarithmic-to-linear-shift theory 
accounts for anchored numerical magnitude representation, 
regardless of boundedness or symbolic format. 

3. Predicting mathematical performance. As in 
Experiment 1, we conducted partial correlation 
analyses between individual participants’ addition and 
subtraction performance and the best-fitting parameter 
values from the models, controlling for age. As shown 
in Table 4, λ in the MLLM predicted both addition and 
subtraction performance across almost all the anchored 
number line tasks after controlling for age. However, for 
the bounded condition, a negative correlation between the 
absolute value of β-1 in the MCPMs and math performance 
was found only for the symbolic bounded anchored task, 
with the absolute value of β_0CPM-1 in MCPM2 negatively 
correlating with addition and subtraction (r=-.26, p<.001 for 
addition; r=-.26, p<.001 for subtraction). This finding 
suggests that the MLLM uniquely predicts math performance, 
regardless of task or age group. 


Figure 3: Median estimates on 0-30 anchored number 
lines for different age groups. 


Table 3: Percent of participants best fit by MLLM for 
anchored number line tasks. K, kindergartners; 1, first 
graders; 2, second graders; A, adults. 


        MLLM 
Task    K     1     2     A     All 
SBA     85    63    63    68    70 
NBA     93    97   100    93    96 
SUA     90    92    78    78    84 
NUA     98    97    95    93    96 


Table 4: Partial correlation between λ in MLLM and math 
score after controlling for age across the anchored number 
line tasks. 


                   Partial correlation 
Task   Parameter   Addition    Subtraction 
SBA    λ           -.29 ***    -.20 * 
NBA    λ           -.22 **     -.15 
SUA    λ           -.30 ***    -.22 ** 
NUA    λ           -.24 **     -.20 * 


Note. * p<.05, ** p<.01, *** p<.001 


Discussion 


Our experiments indicate that the logarithmic-to-linear shift 
account provides a unified framework that can account for 
data coming from a broad array of numerical estimation 
tasks. Specifically, we found a mixed log-linear model 
provided the best fit for the vast majority (90%) 
of children and adults. This finding held regardless of 
whether the numerical format was symbolic or non-symbolic, 
whether the task was bounded or unbounded, and 
whether an additional reference point was given or not. These 
results replicate those reported in Opfer et al. (2016) and 
Kim and Opfer (in press) and extend them to 4 
novel number-line tasks. 

Our results also showed that the logarithmic weight (λ) 
was not fixed, but depended on the developmental history 
and prior experiences of the subject, with λ 
values declining from kindergartners to adults. These findings 
are consistent with the overarching principle of the 
logarithmic-to-linear shift 
theory, which holds that the representation of numerical 
magnitude changes from a logarithmic pattern to a linear 
one with age and experience (Opfer et al., 2011; Opfer & 
Siegler, 2007; Siegler & Booth, 2004; Siegler & Opfer, 
2003; Thompson & Opfer, 2008). 

Finally, individual differences were stable across the eight 
tasks: participants whose estimates were more logarithmic in 
one task were also more logarithmic in the other seven 
tasks, r(156)=.35 to .81, p<.001. This would not be expected 
if the eight tasks elicited radically different estimation 
strategies, and it suggests that the logarithmic-to-linear shift 
theory provides an accurate picture of the mental 
representation underlying all kinds of numerical estimation. 
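
The degrees of freedom reported for these correlations follow from the sample size (df = n - 2), and the significance of even the smallest coefficient can be checked with the standard conversion t = r sqrt(df / (1 - r²)). The short sketch below does the arithmetic; the function name is ours.

```python
import math

def t_from_r(r, n):
    """t statistic for a Pearson correlation r from n participants (df = n - 2)."""
    df = n - 2
    return r * math.sqrt(df / (1.0 - r * r)), df

n = 40 + 38 + 40 + 40          # participants per age group
t, df = t_from_r(0.35, n)
print(df, round(t, 2))         # prints: 156 4.67
```

A t of about 4.67 on 156 degrees of freedom lies well beyond the two-tailed .001 critical value, which is a little over 3.3 at this sample size, so the weakest reported correlation is consistent with the stated p<.001.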


Implications for alternative accounts 


Broadly, our results undercut key claims of the 
proportion-judgment and measurement-skills accounts. A 
key claim of the proportion-judgment account is that 
developmental change involves a change in the degree of 
bias and use of implicit reference points. In this view, the 
degree of bias (β) was thought to gradually converge on 1, 
and more reference points would be utilized by the 
participants, “from an unbounded power to a one-cycle 
proportional to a two-cycle proportional version of the 
model” (Slusser et al., 2013, p.5). If these views were 
correct, the weights for the 0-cyclic power model (w1) and 
the 1-cyclic power model (w2) in MCPM1 would be expected to 
decrease with age and the weight for the 2-cyclic power model 
(w3) would be expected to increase, at the very least among 
the bounded tasks in Experiments 1 and 2. However, we 
found no support for this developmental pattern among any 
of our eight tasks. Additionally, there was no stable pattern 
of individual differences in the degree of bias and use of 
reference points. Given the relatively poor fits of these 
models, this lack of predictive power might not be 
surprising, but it does warrant caution about the 
psychological meaning of the parameter values. 

Our results also provide robust evidence against the 
measurement-skills account. First, according to Cohen and 
Sarnecka (2014), “the implicit addition needed for the 
unbounded task is less mathematically sophisticated than 
the implicit subtraction needed for the bounded task, 
[therefore] children should perform better on the unbounded 
task at a younger age” (p. 1643). Against this contention, we 
found greater accuracy for bounded than unbounded tasks 
regardless of age, symbolic format, or provision of anchors. 
Far from being easier, the unbounded tasks were more 
difficult and actually yielded the highest logarithmicity 
scores. Even more critically, the parameter values of the 
models associated with this account (subtraction and scallop 
bias) were thought to track general subtraction and addition 
skill. If so, one would expect them to predict subtraction and 
addition skill when subjects actually performed subtraction 
and addition. However, we found no evidence that this was 
the case. Again, given the relatively poor fits of these 
models, their lack of predictive power should not be 
surprising. 